With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most existing prediction algorithms can only handle fixed-length numerical vectors. It is therefore important to be able to represent biological sequences of varying lengths as fixed-length numerical vectors. Although several algorithms and software implementations have been developed to address this problem, the existing programs provide only a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program is needed. In this paper, we propose UltraPse as a universal software platform for this problem. UltraPse not only generates various existing sequence representation modes, but also simplifies future programming work in developing novel representation modes. UltraPse is designed for extensibility: users can define their own representation modes, their own physicochemical properties, or even their own types of biological sequences. UltraPse is also the fastest software of its kind. The source code package, as well as executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.